CITISEN: A Deep Learning-Based Speech Signal-Processing Mobile Application

Abstract

This study presents a deep learning-based speech signal-processing mobile application known as CITISEN. CITISEN can perform three functions: speech enhancement (SE), model adaptation (MA), and background noise conversion (BNC), which allow CITISEN to be used as a platform for utilizing and evaluating SE models and for flexibly extending the models to address various noise environments and users. For SE, CITISEN downloads pretrained SE models on the cloud server and then uses these models to effectively reduce noise components from prerecordings or instant recordings provided by users. When it encounters noisy speech signals with unknown speakers or noise types, the MA function allows CITISEN to improve the SE performance effectively. A few audio files of unseen speakers or noise types are recorded and uploaded to the cloud server and then used to adapt the pretrained SE model. Finally, for BNC, CITISEN removes the original background noise using an SE model and then mixes the processed speech signal with new background noise. The novel BNC function can evaluate SE performance under specific conditions, cover people's tracks, and provide entertainment. The experimental results confirmed the effectiveness of the SE, MA, and BNC functions. Compared with the noisy speech signals, the enhanced speech signals achieved improvements of about 6% and 33% in terms of short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ), respectively. With MA, STOI and PESQ could be further improved, the latter by approximately 11%. Note that the SE model and MA method are not limited to the ones described in this study and can be replaced with any SE model and MA method. Finally, the BNC experiment indicated that speech signals with converted backgrounds have a close scene identification accuracy and similar embeddings in an acoustic scene classification model. Therefore, the proposed BNC can convert the background noise of a speech signal and serve as a data augmentation method when clean speech signals are unavailable.
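The BNC step described in the abstract (enhance first, then mix with a new background) can be illustrated with a minimal Python sketch. This is not the CITISEN implementation: the `enhance` callable is a hypothetical placeholder for whatever SE model is used, and scaling the new noise to a target signal-to-noise ratio (`snr_db`) is an assumption, since the abstract does not say how the mixing level is chosen.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the speech-to-noise power ratio is `snr_db`."""
    # Tile or trim the noise so it covers the whole speech signal.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0.0:
        # Silent background: nothing to mix in.
        return speech.copy()
    # Gain that brings the noise to the requested SNR relative to the speech.
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

def convert_background(noisy, enhance, new_noise, snr_db=10.0):
    """BNC sketch: suppress the original background, then mix in a new one.

    `enhance` is a hypothetical stand-in for an SE model (assumption).
    """
    enhanced = enhance(noisy)
    return mix_at_snr(enhanced, new_noise, snr_db)
```

In use, `enhance` would wrap a real SE model that maps a noisy waveform to an enhanced waveform of the same length; passing `lambda x: x` is enough to exercise the mixing step on its own.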

Similar resources

Deep Learning Based Speech Beamforming

Multi-channel speech enhancement with ad-hoc sensors has been a challenging task. Speech model guided beamforming algorithms are able to recover natural sounding speech, but the speech models tend to be oversimplified or the inference would otherwise be too complicated. On the other hand, deep learning based enhancement approaches are able to learn complicated speech distributions and perform e...

Deep learning-based CAD systems for mammography: A review article

Breast cancer is one of the most common types of cancer in women. Screening mammography is a low‑dose X‑ray examination of breasts, which is conducted to detect breast cancer at early stages when the cancerous tumor is too small to be felt as a lump. Screening mammography is conducted for women with no symptoms of breast cancer, for early detection of cancer when the cancer is most treatable an...

Speech Signal Processing

Here we present the C++ library SPC (Speech Signal Processing Classes) as a development tool for assembling speech processing applications. SPC offers real-time processing, batch processing of large databases, visualization, and analysis of signals between processing steps. In SPC the data stream occurring in speech processing is partitioned into three different information flows: signal data, c...

D Speech Signal Processing

consisted of seven Ph.D. students (five of whom were located at the department), a part-time (20%) researcher (forskarassistent), two guest researchers, and a professor. The group performs research encompassed within speech processing, signal processing, and source coding and teaches two undergraduate courses (Information Theory and Source Coding, and Digital Speech Signal Processing), in addit...

Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement

We propose a multi-objective framework to learn both secondary targets not directly related to the intended task of speech enhancement (SE) and the primary target of the clean log-power spectra (LPS) features to be used directly for constructing the enhanced speech signals. In deep neural network (DNN) based SE we introduce an auxiliary structure to learn secondary continuous features, such as ...

Journal

Journal title: IEEE Access

Year: 2022

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2022.3153469